Project - Artificial Neural Networks - Part 1 (Regression)

CONTEXT:

A communications equipment manufacturer has a product that emits informative signals. The company wants to build a machine learning model to predict the equipment's signal quality from various parameters.

DATA DESCRIPTION:

The data set contains information on various signal tests performed:

  1. Parameters: Various measurable signal parameters.
  2. Signal_Quality: Final signal strength or quality

PROJECT OBJECTIVE:

The objective is to build a regressor that uses these parameters to predict the signal strength or quality as a number.

STRATEGY:

  1. Use a multi-stage strategy to tune the ANN's hyperparameters with an open-source search-automation package. We have chosen Optuna for this purpose.
  2. Develop a web-app playground to test different combinations of hyperparameter values and visualise the training and testing process with a live plot.

(1) Import all Python Libraries

(2) Data loading and verification

Observation:

  1. Parameter 1 to Parameter 11 are independent variables.
  2. Signal Strength is the dependent variable.

(2.1) Data Description

Observations:

  1. Variables Parameter 1 to Parameter 11 are of data type float.
  2. Signal Strength is of data type integer.

Observations:

  1. Parameter 8 has a very short range of about 0.001 and may not add significant variance to the overall data set.
  2. Similarly, Parameter 5 has a very small interquartile range, with some significant outliers to the right of the distribution.
  3. Signal Strength is a discrete integer between 3 and 8.

(2.2) Data Verification

Observations:

  1. There are no NaN or null values in the given dataframe; no additional cleansing is required.
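
The check above can be sketched with pandas. The tiny frame below is a hypothetical stand-in for the real dataset (only a few of the Parameter columns, invented values):

```python
import pandas as pd

# Hypothetical frame mimicking the dataset's shape: Parameter columns plus Signal_Strength.
df = pd.DataFrame({
    "Parameter 1": [0.7, 0.5, 0.6],
    "Parameter 8": [0.996, 0.997, 0.995],
    "Signal_Strength": [5, 6, 7],
})

# Count missing values per column; a zero total confirms no cleansing is needed.
missing_per_column = df.isnull().sum()
total_missing = int(missing_per_column.sum())
print(total_missing)  # 0 for this clean frame
```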

(3) EDA ( Exploratory Data Analysis)

(3a) - Univariate Analysis

Observations:

  1. Parameter 5 and Parameter 8 have very short IQRs.
  2. If outliers are eliminated, the overall variance in the data would be reduced.
  3. It might be worth refactoring these variables as ratios, or raising them to a higher-degree exponent, during feature engineering.
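
A minimal sketch of the IQR/outlier reasoning above, using synthetic stand-ins (a wide-spread parameter versus a tight one like Parameter 8):

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-ins: one wide-spread parameter, one with a tiny spread.
wide = rng.normal(loc=10.0, scale=2.0, size=1000)
tight = rng.normal(loc=0.9965, scale=0.0002, size=1000)

def iqr_outlier_bounds(x):
    """Tukey fences: values outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR] count as outliers."""
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr, iqr

lo, hi, iqr_wide = iqr_outlier_bounds(wide)
_, _, iqr_tight = iqr_outlier_bounds(tight)
print(iqr_tight < iqr_wide)  # the tight parameter has a far smaller IQR
```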

(3b) Multivariate Analysis

Observations:

  1. There are significant correlations between most of the variables.
  2. The most significant correlations are between Parameter 1 and Parameter 3, Parameter 1 and Parameter 8, and Parameter 6 and Parameter 7, at almost 0.67.
  3. Signal Strength is most highly correlated with Parameter 10 and Parameter 11.
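
Ranking the strongest pairwise correlations can be sketched as below. The data is synthetic (Parameter 3 is constructed to correlate with Parameter 1), so only the mechanics mirror the notebook:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
p1 = rng.normal(size=n)
p3 = 0.7 * p1 + rng.normal(scale=0.7, size=n)   # built to correlate with p1
p6 = rng.normal(size=n)

data = np.column_stack([p1, p3, p6])
names = ["Parameter 1", "Parameter 3", "Parameter 6"]

corr = np.corrcoef(data, rowvar=False)

# Collect and sort the off-diagonal pairs by absolute correlation.
pairs = []
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        pairs.append((abs(corr[i, j]), names[i], names[j]))
pairs.sort(reverse=True)
print(pairs[0][1], pairs[0][2])  # Parameter 1 / Parameter 3 dominate by construction
```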

(3c) Variance and Multicollinearity

Observations:

  1. Parameter 1, Parameter 4, Parameter 6, Parameter 7 and Parameter 11 are the top 5 variables showing high variance in the data.
  2. Parameter 8 shows nearly zero variance, with Parameter 5 the next lowest. These parameters could be discarded during modelling.

Observations:

  1. Parameters 8, 9, 11, 1, 10 and 2 have very high VIFs, i.e. multicollinearity.
  2. We will attempt to reduce this during feature engineering.
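
VIF can be computed directly from its definition, VIF_i = 1 / (1 - R²_i), where R²_i comes from regressing column i on the remaining columns. A numpy-only sketch on synthetic data (two near-collinear columns, one independent):

```python
import numpy as np

def vif(X):
    """VIF_i = 1 / (1 - R^2_i), with R^2_i from regressing
    column i on all other columns (plus an intercept)."""
    n, k = X.shape
    out = []
    for i in range(k):
        y = X[:, i]
        others = np.delete(X, i, axis=1)
        A = np.column_stack([np.ones(n), others])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        r2 = 1.0 - resid.var() / y.var()
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

rng = np.random.default_rng(2)
a = rng.normal(size=300)
b = a + rng.normal(scale=0.1, size=300)   # nearly collinear with a -> high VIF
c = rng.normal(size=300)                  # independent -> VIF near 1
v = vif(np.column_stack([a, b, c]))
print(v.round(1))  # first two entries large, third close to 1
```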

(4) Creating Helper Classes for Model creation and Live plotting

(4a) Class to create and customize DNNs.

This is a generic helper class to create an N-layer ANN with the flexibility to place BatchNormalization, Activation and Dropout layers.
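
The layer-ordering logic such a helper encapsulates can be sketched as below. To keep the example self-contained, the Keras layer calls are replaced by layer-spec strings; the function name and defaults are hypothetical, not the notebook's actual helper:

```python
def build_layer_plan(n_hidden, units, use_batchnorm=True,
                     activation="relu", dropout_rate=0.2):
    """Return an ordered layer plan for an N-hidden-layer regressor.
    Each hidden block follows the order used in this project:
    Dense -> BatchNormalization -> Activation -> Dropout."""
    plan = []
    for i in range(n_hidden):
        plan.append(f"Dense({units[i]})")
        if use_batchnorm:
            plan.append("BatchNormalization")
        plan.append(f"Activation({activation})")
        if dropout_rate:
            plan.append(f"Dropout({dropout_rate})")
    plan.append("Dense(1)")  # single linear output for regression
    return plan

plan = build_layer_plan(2, [64, 32])
print(plan)
```

In the real helper, each spec string would correspond to adding the matching `tf.keras.layers` object to a `Sequential` model.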

(4b) Callback Class for Live plotting during training

This is a helper class to live-plot the training progress during modelling.
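
The essence of such a callback is collecting per-epoch losses so a chart can be redrawn after every epoch. A minimal stand-in (the plotting itself, e.g. with matplotlib, is omitted; class and method names mirror the Keras callback interface):

```python
class LivePlotCallback:
    """Minimal stand-in for a Keras callback: records per-epoch losses so a
    plotting routine can refresh the live chart after each epoch."""
    def __init__(self):
        self.history = {"loss": [], "val_loss": []}

    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        for key in self.history:
            if key in logs:
                self.history[key].append(logs[key])
        # A real callback would redraw the live plot here.

# Simulate three training epochs with improving losses.
cb = LivePlotCallback()
for epoch, (tr, va) in enumerate([(0.9, 1.0), (0.6, 0.7), (0.45, 0.5)]):
    cb.on_epoch_end(epoch, {"loss": tr, "val_loss": va})
print(cb.history["loss"])  # [0.9, 0.6, 0.45]
```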

(5) Feature Engineering and Feature Selection

(5a) Pre Modelling Baseline

Baselining strategy:

  1. We will create a baseline model to assess performance with just the scaled data, with feature engineering, and with feature selection.
  2. This baseline model/data will then be subjected to different levels of hyperparameter optimization.
  3. To baseline the model, we will assume some basic hyperparameters as shown below.
  4. The loss and performance metric for this regression problem will be Mean Absolute Error (MAE).
1> Without Feature Engineering

Observation:

  1. An ANN model is created with BatchNormalization, Activation and Dropout layers, in that order.
  2. The model uses the defaults defined in the ANN helper class.
2> Feature Engineering
We will transform the original variables into quadratic variables `(degree 2)` and drop `Parameter 8`, as its values are very close to `1` and raising them to a higher power will not create any significant variance.
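
One reading of the step above, sketched with numpy on random stand-in data: drop Parameter 8, then append squared (degree-2) terms for the remaining parameters. Shapes are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)
names = [f"Parameter {i}" for i in range(1, 12)]
X = rng.normal(size=(100, 11))  # hypothetical stand-in for the scaled data

# Drop Parameter 8 (its values hover near 1, so squaring adds no variance) ...
keep = [i for i, n in enumerate(names) if n != "Parameter 8"]
X_kept = X[:, keep]

# ... then append degree-2 terms for the remaining parameters.
X_quad = np.hstack([X_kept, X_kept ** 2])
print(X_quad.shape)  # (100, 20): 10 originals + 10 squared terms
```
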
Observation:
  1. We see that the VIF is significantly reduced from the original value.
Observation:
  1. We can see that feature engineered data yields better result when compared to the previous experiment.
3> Feature Selection
Observation:
  1. The model-based feature selection has returned 3 variables: Parameter 11, 2 and 10. This significantly reduces the feature space.
  2. We suspect that this might reduce the performance of the model.

Observation:

  1. Surprisingly, the loss and performance are better than in the previous 2 experiments, as shown in the dashboard. This means that 3 out of the 11 parameters are sufficient for decent performance.
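
The idea behind feature selection can be illustrated with a simple correlation-filter stand-in (not the model-based selector used in the notebook): score each feature by its absolute correlation with the target and keep the top k. The data below is synthetic, constructed so that three columns carry the signal:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 400
X = rng.normal(size=(n, 4))
names = ["Parameter 2", "Parameter 5", "Parameter 10", "Parameter 11"]
# Target built so three columns carry signal, loosely mimicking the outcome above.
y = 2.0 * X[:, 3] + 1.5 * X[:, 0] + 1.0 * X[:, 2] + rng.normal(scale=0.5, size=n)

# Rank features by |correlation| with the target and keep the top 3.
scores = np.array([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])])
top3 = [names[j] for j in np.argsort(scores)[::-1][:3]]
print(sorted(top3))
```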
Observations:

Based on the scoreboard above, the loss/metric is lowest for the data with feature selection, so we will use this as the baseline for our hyperparameter tuning and model training.

(6) Hyperparameter Tuning using Optuna

https://optuna.readthedocs.io/en/stable/

Hyperparameter tuning strategy:

  1. Architecture Selection: find the optimal number of layers, number of neurons per layer, and the optimal combination of Activation, BatchNormalization and Dropout.
  2. Coarse Tuning: find the optimal weight-initialization parameters, activation type and dropout rate at each layer.
  3. Fine Tuning: find the optimal optimizer and its parameters.
  4. Final Tuning: find the optimal number of epochs and batch size.

At each stage, the parameters determined in the previous stage flow into the next one to override the defaults.
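
The staged flow above can be sketched without Optuna itself. Below, a stdlib random search stands in for an Optuna study, and the `evaluate` function is a toy objective rather than a real validation-MAE computation; the point is only how each stage's best params seed the next stage's defaults:

```python
import random

random.seed(0)

def evaluate(params):
    """Toy objective standing in for a validation-MAE computation."""
    target = {"n_layers": 3, "lr": 0.0003, "dropout": 0.2}
    return (abs(params["n_layers"] - target["n_layers"])
            + abs(params["lr"] - target["lr"]) * 1000
            + abs(params["dropout"] - target["dropout"]))

def run_stage(base, space, n_trials=30):
    """Random-search one stage over `space`; the best values override `base`."""
    best, best_score = dict(base), evaluate(base)
    for _ in range(n_trials):
        trial = dict(base)
        for key, choices in space.items():
            trial[key] = random.choice(choices)
        score = evaluate(trial)
        if score < best_score:
            best, best_score = trial, score
    return best

defaults = {"n_layers": 2, "lr": 0.001, "dropout": 0.5}
stage1 = run_stage(defaults, {"n_layers": [1, 2, 3, 5, 7]})       # architecture
stage2 = run_stage(stage1, {"dropout": [0.0, 0.1, 0.2, 0.4]})     # coarse tuning
stage3 = run_stage(stage2, {"lr": [0.01, 0.001, 0.0003, 0.0001]}) # fine tuning
print(stage3)
```

With Optuna, each `run_stage` call would become a `study.optimize(...)` over an objective that reads `trial.suggest_*` values and returns the validation MAE.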

(6.1) Architecture Selection by hyperparameter tuning

Observation:

  1. The best recorded loss is 0.398, which is lower than the loss with the defaults, but this requires 7 hidden layers with the neuron counts listed above.

Observations:

  1. The parallel coordinate plot shows the different combinations of values explored to reach the best objective value of about 0.398.

(6.2) Coarse tuning by hyper parameter tuning

Observation:

  1. The coarse tuning has resulted in a marginal degradation of 0.001, with ReLU as the activation function and a uniform weight initialiser.

We reform the best-params construct to feed into the next stage.

(6.3) Fine tuning - Hunt for optimizer

Observations:

  1. The fine tuning has reached 0.388, a further reduction of almost 0.01 from the previous stage, with the Adam optimizer and a learning rate of 0.0003.

(6.4) Final tuning by hyper parameter tuning

Observation:

  1. The best loss at this stage is 0.433 at 500 epochs and a batch size of 100, i.e. worse than the fine-tuning result by roughly 0.045. Hence we will not pick these values and will instead use the default values of 10 and 50 for batch_size and epochs respectively.

(6.5) Testing model with tuned hyper parameters and manual tuning

We override the epoch count and batch size obtained from the final tuning with the default params, and increase the epochs to 100 to get a more stable MAE.

(6.6) Consolidating the hyperparameter list to be used in GUI

(7) Conclusion:

  1. An ANN model was baselined against scaled, feature-engineered and feature-selected data. The baseline against the model-based feature selection performed better than the other two.
  2. MAE is used as the loss function and performance metric in training and testing the ANN.
  3. The entire dataset was split into train and test sets; validation data was split from the training data at model-fitting time.
  4. A multi stage strategy was applied in tuning hyper parameters of the ANN.

    Stage 1: Architecture Selection - MAE of 0.39887

    Stage 2: Coarse tuning - MAE of 0.399832

    Stage 3: Fine tuning - MAE of 0.38822

    Stage 4: Final tuning - MAE of 0.439973

    However, all hyperparameters prior to final tuning were retained and defaults applied to stage 4 parameters.

  5. A web application built with Streamlit, an open-source framework, was created to test the hyperparameters and to save the model. You can try it at http://34.70.58.49:8085/. Please make sure to upload the data file and copy-paste the hyperparameters above into the respective text input areas.